Data-adaptive inference of the optimal treatment rule and its mean reward. The masked bandit
Abstract
This article studies the data-adaptive inference of an optimal treatment rule. A treatment rule is an individualized treatment strategy in which the treatment assigned to a patient is based on her measured baseline covariates. Eventually, a reward is measured on the patient. We also infer the mean reward under the optimal treatment rule. We do so in the so-called non-exceptional case, i.e., assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption. Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal treatment rule. This data-adaptive statistical parameter is of interest in its own right. Our main result is a central limit theorem that enables the construction of confidence intervals for both the mean reward under the current estimate of the optimal treatment rule and the mean reward under the optimal treatment rule itself. The asymptotic variance of the estimator takes the form of the variance of an efficient influence curve at a limiting distribution, which makes it possible to discuss the efficiency of inference. As a by-product, we also derive confidence intervals for two cumulated pseudo-regrets, a key notion in the study of bandit problems. Seen as two additional data-adaptive statistical parameters, they compare the sum of the rewards actually received during the course of the experiment with either the sum of the means of the rewards or the counterfactual rewards we would have obtained if we had used, from the start, the current estimate of the optimal treatment rule to assign treatment. A simulation study illustrates the procedure. One of the cornerstones of the theoretical study is a new maximal inequality for martingales with respect to the uniform entropy integral.
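To make the vocabulary of the abstract concrete, the following minimal Python sketch works through a toy example. Everything in it is an assumption made for illustration: the conditional mean reward `q_mean`, the uniform covariate, the coin-flip assignment used to generate the experiment, and the simplified `cumulated_pseudo_regret` function are hypothetical and are not the article's estimator, notation, or simulation design. The sketch only shows what a treatment rule, the mean reward under a rule, and a regret-style comparison look like when the true reward mechanism is known.

```python
import random


def q_mean(a, w):
    """Hypothetical conditional mean reward E[Y | A=a, W=w], with a in {0, 1}."""
    # Assumed treatment effect: q_mean(1, w) - q_mean(0, w) = 0.4 * (w - 0.5).
    # It vanishes only at w = 0.5, a set of probability zero for uniform W,
    # so this toy model mimics the non-exceptional case.
    return 0.5 + (0.4 * (w - 0.5) if a == 1 else 0.0)


def optimal_rule(w):
    """Treat exactly when the conditional treatment effect is positive."""
    return 1 if q_mean(1, w) - q_mean(0, w) > 0 else 0


def mean_reward(rule, covariates):
    """Monte Carlo approximation of the mean reward E[q_mean(rule(W), W)]."""
    return sum(q_mean(rule(w), w) for w in covariates) / len(covariates)


def cumulated_pseudo_regret(rule, history):
    """Simplified regret-style quantity (an assumption, not the article's definition).

    Sums, over the experiment, the mean reward that `rule` would have earned
    minus the reward actually received.
    """
    return sum(q_mean(rule(w), w) - y for (w, _a, y) in history)


rng = random.Random(0)

# Mean rewards under three rules, with W ~ Uniform(0, 1).
covariates = [rng.random() for _ in range(100_000)]
value_optimal = mean_reward(optimal_rule, covariates)
value_always_treat = mean_reward(lambda w: 1, covariates)
value_never_treat = mean_reward(lambda w: 0, covariates)

# A toy experiment with coin-flip assignment and Bernoulli rewards.
history = []
for _ in range(10_000):
    w = rng.random()
    a = rng.randint(0, 1)
    y = 1 if rng.random() < q_mean(a, w) else 0
    history.append((w, a, y))

regret_vs_optimal = cumulated_pseudo_regret(optimal_rule, history)

print(f"mean reward, optimal rule: {value_optimal:.3f}")
print(f"mean reward, always treat: {value_always_treat:.3f}")
print(f"mean reward, never treat:  {value_never_treat:.3f}")
print(f"regret-style comparison:   {regret_vs_optimal:.1f}")
```

In this toy model the optimal rule earns a higher mean reward than either constant rule, and the regret-style quantity is positive because coin-flip assignment ignores the covariate. In the setting of the article the reward mechanism is unknown, so the rule is estimated from the data and the mean reward is inferred with confidence intervals rather than computed directly.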